ATTENTION TO AND MEMORY FOR
AUDIO AND VIDEO INFORMATION IN TELEVISION SCENES 



Abstract 

This study investigates whether selective attention to a particular television modality results in 
different levels of attention to and memory for each modality. Two independent variables manipulated 
selective attention. These were the semantic channel (audio or video) and viewers' instructed focus (audio or
video). These variables are fully crossed in a within-subjects experimental design. Attention levels are
investigated by measuring reaction times to cues in each modality (audio tones and color flashes). Memory 
questions ask about channel-specific contents. 

Both selective attention manipulations affect intensive measures of attention similarly. Because of 
this similarity, the modalities appear to tap a common pool of resources. Memory measures show a
modality-specific effect. Visual information is remembered whether or not that information is important 
semantically, and whether or not subjects are instructed to focus on that channel. Auditory information,
however, is better remembered when viewers are focused on the audio channel. Auditory information and
auditory-based messages appear to demand greater resources than visual information and visual-based 
messages. 



ATTENTION TO AND MEMORY FOR 
AUDIO AND VIDEO INFORMATION IN TELEVISION SCENES 

Television contains two channels of information - audio and video. Some theorists suggest that the
visual information on television helps people to understand the verbal information presented (Graber, 1990;
Katz, Adoni & Parness, 1977; Lewis, 1984). Other theorists, however, propose that visual information
interferes with people's ability to understand verbal information (Grimes, 1990a, 1991; Gunter, 1987;
Robinson & Davis, 1990). Some theorists who believe in interference think that people miss information in
the other channel. Other interference theorists think that people's information processing capacities are
taxed by additional information. Believers in interference think that when people watch they cannot listen.
Less frequently, they think that when people listen they cannot watch.

An unanswered question is whether people process both channels of television information at once 
or whether they examine one at a time. Several studies of simultaneous multiple-channel processing have 
been attempted in psychology. Some of this research has investigated the processing of meaningful stimuli. 
In the 1950s, the locus of much of the research was how many channels of audio information people could 
handle at a time. Cherry (1953), for example, investigated the "cocktail party problem" - how many
conversations could be overheard simultaneously. The answer was, in most cases, only one (Broadbent, 1958; 
Cherry, 1953). This research suggests that only one source of meaningful or semantic information presented 
in a single modality can be understood at one time. 

Later psychological research has investigated simpler stimuli. Research has examined whether the 
detection of light flashes is helped or hindered by the presence of audio tones (Dornbush, 1968, 1970; 
Halpern & Lantz, 1974; Ingersoll & DiVesta, 1974; Lindsay, Taylor & Forbes, 1968; Triesman & Davies,
1973). The results of this research suggest that detection of non-meaningful or non-semantic signals can 
occur in two modalities at the same time.

The results of this psychological research appear to indicate that people can inspect more than one 
channel, but may not be able to understand more than one. This assertion, however, is derived from distinct 
types of studies. Comparing them in this way confounds the complexity of the stimulus with the dependent 
measure that was used to assess "limitations." Specifically, detection studies use light flashes and tones. The
effects are measured as differences in reaction times or accuracy. People can detect signals in multiple
modalities as quickly and as accurately as in a single modality. The comprehension studies, however, use
sentences and stories. The effects are measured as differences in learning or memory. People are generally
able to detect simultaneous tones and flashes, but are not able to comprehend or remember messages from
more than a single channel. It is not clear whether the limitation is due to the increased complexity of the
stimuli themselves or to the different form of processing that is performed. This research will address the
question of whether people can attend to and remember both modalities at once.

Because television material is complex, these limitations may be imposed by a single channel system. 
The limitations may also be imposed by resource restrictions. The next section will examine psychological 
concepts and theories about people's restrictions in processing information. Next, we will examine how these 
restrictions can be applied to television and discuss previous communication research in this area. Specific 
hypotheses will be formulated. The experiment that was used to investigate this question and its results
will then be discussed. The final section discusses the conclusions that can be drawn from this
experiment.

Theories about people

The way people process information can be envisioned as a series of stages (Basil,
1991; Craik & Lockhart, 1972; Hsia, 1971; Wickens, 1984; Woodall, Davis & Sahin, 1983). This individual 
stage approach to processing may help organize the literature, findings, and current thinking to develop 
definitions of audience and media appropriate to psychological theories. 

A general overview of these stages was discussed by Craik and Lockhart (1972) as the "depth of 
processing." Specifically, people analyze stimuli at several levels. The first level involves sensory analysis of 
the material. Preliminary analysis is concerned with lines, angles, brightness, pitch, or loudness. If only the 
first stage occurs, this is described as "shallow processing." The second level involves analysis for meaning. 
Information is compared to abstractions from past learning. Information is enriched or elaborated 
upon. When the second stage of analysis occurs, Craik and Lockhart describe this as "deeper processing." 
They propose that deeper processing leads to better memory. The third level involves memory itself.
Memory is the storage and retrieval of information. Information is stored as a "memory trace" (Craik &
Lockhart, 1972). These processes are illustrated in Figure 1. 



FIGURE 1:
Stages of information processing

[Figure not reproduced. Recoverable box labels: Sensory Processing, Attention, Semantic Processing, Memory]



This experiment will concentrate on attentional limitations. Conceptualizations of this process and
how it could bear on audiences' comprehension of television messages are discussed in the next section.

Attention

The notion of attention is based on the observation that people are only able to handle a limited 
amount of information at a time (Broadbent, 1958; Cherry, 1953). Various mechanisms have been proposed
to account for how people manage incoming information. The history of attention research may provide 
insights into the nature of this process. 

Early research conceptualized attention as a structural filter (Broadbent, 1958; Cherry, 1953).
According to structural models, the process of attention is a precursor to a sequence of information
processing events (Craik & Lockhart, 1972; Harris, 1983; Norman & Bobrow, 1975). A filter selects
information for further processing. Information that is not attended to cannot be processed further. For
example, messages that are not attended will not be remembered (Broadbent, 1958). A filter model is shown
in Figure 2. 



FIGURE 2:
Structural Model of Attention 

Researchers, however, noticed that subjects can detect their own name even when it is presented in a
"non-attended" channel (Moray, 1959). Unattended information is sometimes remembered. Unattended
information, then, appears to be admitted into the processing system (Triesman & Riley, 1969). Because 
unattended information appears to receive some awareness, later theorists have come to believe that there is 
not necessarily a particular filter (Allport, 1989; Wickens, 1984). Instead, they propose that attention is a 
process of resource allocation (Kahneman, 1973). Resources are allocated to particular channels of incoming 
information. Those channels that receive the most resources are processed to a greater extent (Navon & 
Gopher, 1979, 1980). An illustration of a resource model is shown in Figure 3. 



FIGURE 3: 
Resource Model of Attention 

Attentional resources, however, are limited (Kahneman, 1973). Because of the importance of these resource
limitations, these models are referred to as resource or capacity models of attention. Resource models
propose that the allocation of resources to sensory channels can also depend on the nature of the task, the
stimulus, or its relevance (Kahneman, 1973).

People's limitation in processing simultaneous stimuli, then, has been explained in two ways.
Structural theorists feel that limitations are determined by the processing system architecture of the brain. 
They believe, for example, that a filtering mechanism limits processors to one channel of arriving information 
at a time. Resource theorists, however, do not believe that limitations are imposed by the architecture of the 
brain. Instead, they believe that limitations are determined by limited resources (Kahneman, 1973; Navon & 
Gopher, 1979, 1980). 
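
The contrast between the two accounts can be made concrete with a toy sketch. It is an illustration only, not a model proposed by any of the cited authors: the function names, the `demands` and `weights` dictionaries, and the numeric values are all assumptions introduced here.

```python
def filter_model(demands, attended):
    """Structural account (toy version): only the attended channel
    passes the filter; every other channel gets no processing."""
    return {ch: (1.0 if ch == attended else 0.0) for ch in demands}


def resource_model(demands, weights, capacity=1.0):
    """Resource account (toy version): a limited capacity is divided
    among channels in proportion to allocation weights, and each
    channel is processed only as far as its share covers its demand."""
    total_weight = sum(weights.values())
    processed = {}
    for channel, demand in demands.items():
        share = capacity * weights[channel] / total_weight
        # Fraction of the channel's demand that the share can cover.
        processed[channel] = min(1.0, share / demand) if demand > 0 else 1.0
    return processed


# Hypothetical values: audio is assumed to need more capacity than video.
demands = {"audio": 0.8, "video": 0.4}
weights = {"audio": 1.0, "video": 1.0}

filtered = filter_model(demands, attended="audio")
shared = resource_model(demands, weights)
print(filtered)
print(shared)
```

Under these assumed values, the filter account leaves the unattended video channel unprocessed, while the resource account processes video fully and audio only in part.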

This study conceptualizes attention according to both models of attention. First, the selection of
information is described as "focusing." Selective attention or focusing can be controlled by the viewer 
(Anderson & Lorch, 1983). It depends on the desires of the viewer and on aspects of the message (Drew & 
Grimes, 1987). Two operationalizations of selective attention are presented in the method section. The 
second conceptualization of attention is the level of resources. The allocation of resources can also be
described as the "level of attention." Resources are allocated by the viewer. Whether this allocation can be 
changed by conscious effort on the part of the viewer is not known. Kahneman (1973) believes it can. Using 
Kahneman's conceptualization, attention level may also be specific to particular modalities (Wickens, 1980). 
The attention level, then, is the activation level of a particular sensory modality - auditory or visual. 
Attention levels vary over time. Attention levels can range from none to some maximum attentional 

Implications

The presence of multiple stages of processing suggests competition for processors' focus and 
resources (Burriss, 1987; Swets, 1984; Wickens, 1984). It is not clear whether the resources for a particular
modality of television information are reduced by a second modality (Grimes, 1991). It is possible that 
sufficient resources are available for processing both modalities. It is also possible, however, that limitations 
in the information processing system restrict viewers' ability to glean information from two modalities
simultaneously. For example, viewers may focus on one modality and ignore information in the other
modality. 

If there are limitations on processing, they could result from two potential causes. First,
limitations may be structural. People simply may not be able to examine two channels simultaneously.
Second, limitations may reflect resource or capacity restrictions. People may be able to examine both
channels simultaneously, but only if that information is simple.

Previous research 

Previous research using memory measures 

Several studies of the extent to which video information affects the understanding of audio
information have also been done in the field of communication. Most of these studies rest on the structural
filter models of selection to infer attention. If memory is good, then attention must have been high
(Broadbent, 1958; Triesman, 1960). These theories suggest that television viewers can receive one channel or
the other. As a result, researchers usually measure comprehension or memory for information in only one of
the two channels. Most of these studies investigated memory for information in the audio channel (Burriss,
1987; Dornbush, 1968, 1970; Drew & Grimes, 1987; Edwardson, Grooms & Pringle, 1976; Edwardson,
Grooms & Proudlove, 1981; Edwardson, Kent & McConnell, 1985; Edwardson, Kent, Engstrom &
Hofmann, 1991; Hoffner, Cantor & Thorson, 1989; Hsia, 1968a, 1968b; Ingersoll & DiVesta, 1974; Katz et
al., 1977; Kisielius & Sternthal, 1984; McDaniel, 1973; Warshaw, 1978; Young & Bellezza, 1982).

The results of this line of research demonstrate that in about half of these studies the presence of
visuals interferes with understanding of the audio content (Burriss, 1987; Drew & Grimes, 1987; Edwardson et
al., 1985, 1991; Hoffner et al., 1989; Son, Reese & Davie, 1987; Warshaw, 1978).

The other half of the studies, however, have found that visuals did not interfere with memory for
audio information (Dornbush, 1968, 1970; Edwardson et al., 1981; Gunter, 1980; Ingersoll & DiVesta, 1974;
Katz et al., 1977; Kisielius & Sternthal, 1984). When the visual information complements the audio
information, comprehension can even be enhanced (Drew & Reese, 1984; Findahl, 1981; McDaniel, 1973;
Reese, 1984). These findings conflict not only with the previous findings, but with the filter model of
attention. Unfortunately, these studies have examined only the effects of visuals on auditory
comprehension and memory. If interference occurs, what happens to people's memory for the visual
information? Five studies have examined memory measures for both audio and video information. Pezdek
and Hartman (1983) distracted children with either a visual distractor (a toy) or an auditory distractor (a
record playing). They found that such distractions were modality specific. That is, a visual distraction
interfered only with memory for visual information and an auditory distraction interfered only with memory
for audio information. These results suggest that attention and memory are indeed modality specific.

Pezdek and Stevens (1984) compared memory for auditory and visual information under four 
conditions: match, mismatch, video-only, and audio-only. The results showed memory for information was 
the same in the matched condition as in the single-modality condition. In the mismatched condition, 
however, "processing the audio information suffers more than does processing the video information" (p.
217). Differences between recognition and comprehension scores suggest that the findings do not represent
limitations in processing, but in the selection of channels.

Similarly, Drew and Grimes (1987) compared memory for redundant or conflicting audio-visual 
information in news stories. They observed that when exposed to conflicting information, viewers "attend to
the video at the expense of the audio" (1987: 459). When faced with two channels of conflicting information,
the viewer's filter selected the auditory channel. This finding shows evidence of a single channel system but
is contrary to other research that shows visual dominance (Posner et al., 1976).

Another study conducted by Grimes (1990b) examined the possibility of information from one 
channel creeping into the other channel. This research is similar to several studies in psychology (e.g., Loftus
& Palmer, 1974). Grimes found that viewers occasionally translated auditory information into visual 
"memories." When faced with complementary channels of television news information, it appears that audio 
information was more likely to affect visual memory than the reverse. The results are interpreted to suggest 
that viewers have a single code for memory. If that is the case, however, it is not clear why audio memory 
wasn't as affected by pictures. 

Newhagen (1990) also examined both auditory and visual memory for two types of television news
scenes. He manipulated the presence or absence of compelling visuals and measured memory. People were
shown to have poorer memory for the audio information when it was accompanied by compelling visuals. The
results appear to demonstrate that people were overwhelmed by these attention-demanding features and
were unable to dedicate sufficient processing to understand the message. Some of these effects, however,
may have been attributable to changes in the primary modality of processing. The compelling nature of the
visual stimuli may have caused subjects to switch from the audio channel, which contained the majority of the
semantic information, to processing the video channel. In this case, the nature of the stimulus may have
determined the primary processing channel. The results - that compelling visual material caused poorer 
comprehension and memory for audio information - are compatible with either explanation. It is not clear 
whether what occurred was an overload in capacity, or the selection of visual information to the exclusion of 
the auditory information. Other potential explanations for the variability in results will be discussed below. 

These studies have attempted to use memory-based measures to draw inferences about attentional 
limitations. Recall and other memory measures, however, are limited in their ability to identify attentional 
limitations for five reasons. First, most studies use varying levels of redundancy between the two channels 
(e.g., Drew & Grimes, 1987; Grimes, 1990a, 1990b). In cases of redundancy, viewers could have received the 
information from either channel (Severin, 1967). These studies, then, are not always comparing instances of
two sources of information. Instead, they are comparing instances of receiving one message split over two
channels with receiving two distinct messages. This approach will not
necessarily tell us whether a single channel is selected or whether both channels are examined (Grimes, 
1990b). Second, memory measures provide little information about the nature of the limitation. The 
limitations may arise while watching and listening or in later processing. For example, Kahneman's (1973)
model suggests that allocating attention reduces other remaining resources. As a result, it is difficult to 
determine whether receivers are unable to attend to both channels or whether their resources do not allow 
the processing of both channels (e.g., Hsia, 1968b; Jester, 1968; Swets, 1984; Travers, 1970). Third, the
nature of the visuals themselves also affects memory. For example, shocking visuals may interfere with
comprehension of the audio channel (Newhagen, 1990). Therefore, finding poorer memory for specific
information does not necessarily indicate that memory problems are caused by attention difficulties. Instead,
they may be caused by difficulties with the nature of the content. Fourth, interest or motivation levels are
allowed to vary. For example, visuals can improve learning when they generate greater interest (Edwardson
et al., 1981; Katz et al., 1977). Better knowledge or comprehension may not be attributable to content, but
to more effort. Fifth, the presence of visuals may operate by distracting viewers from the audio channel.
Distraction appears to occur only when the visuals are interesting (Edwardson et al., 1976). Distraction,
strictly speaking, is not evidence of cognitive limitations. These five explanations, then, suggest other factors 
that may determine memory independent of attention, and therefore, suggest that memory is not a 
reasonable surrogate for attention. 

After considerable research, then, it is still not clear whether it is visual features or visual content
that interferes with audio comprehension. It does appear that there is not an all-or-nothing
filter that selects visual information at the expense of auditory information. It is still not clear,
however, whether the presence of activity in the other modality changes viewers' selective attention or
whether complexity of that information interferes with viewers' comprehension of auditory material.

Another difficulty is the reliance on news stories for this research. Television news stories generally 
contain the majority of their semantic information in the auditory channel (Barkin, 1989; Grimes, 1991). 
Additional visual information is usually added to the auditory message. If visual information were the basis 
of a message, however, it is not clear whether audio features or content would interfere with visual 
comprehension. Therefore, we have no information applicable to other television genres. 

Previous research using attention measures

The other factors that affect memory suggest that the examination of attentional limitations would 
benefit from measurement of attention to each channel directly. The process of attention allocation to 
television information has been subject to empirical investigation in communication. These studies have 
investigated general attention to the audio and video channels (Geiger & Reeves, 1989, 1991; Grimes, 1990;
Meadowcroft & Reeves, 1989; Meadowcroft & Watt, 1989; Reeves & Thorson, 1986; Reeves et al., 1985,
1991; Thorson et al., 1985, 1987; Watt & Meadowcroft, 1990; Wartella & Ettema, 1974). Specifically,
research has examined the amount of attention people pay to complete messages or the amount of effort that 
the entire message demands. 

Kahneman's (1973) proposal that resources can be shifted between modalities suggests that
attentional resources may be modality-specific. People may be able to process information and recognize
cues in a modality only to the extent that attentional resources are available in that modality. This proposal
suggests that each modality has its own attentional resource level (Eysenck, 1982; Kahneman, 1973; Wickens,
1980). As a result, one way of studying attentional limitations is to examine modality-specific attention levels.

Prior research has not explicitly examined whether resources are specific to each modality (Eysenck, 
1982). If, however, attentional mechanisms share a common resource pool, allocating attention to one 
channel may reduce the attention available to the other channel. Because of the dual-channel nature of 
television, the study of competition for attention is important (Wickens, 1984). 

There may be two reasons why little research has followed up the question of modality-specific 
attention. First, psychologists typically investigate processing effort for single-channel tasks such as reading
or memorizing nonsense syllables. In these instances, the secondary tasks are assumed to be slowed as a
result of processing effort. Secondary reaction times are a measure of resources left over from processing. 
Reaction time in either channel should provide similar results. 

The second reason modality-specific effects may not have been measured is that psychologists usually
assess effort in modalities that do not interfere with the presentation of material. For example, audio tones 
are used to assess reading difficulty (Britton, Glynn, Meyer & Penland, 1972). This research compares 
reaction times while reading simple and complex passages. It can be observed, however, that single-modality 
results are sometimes counter-intuitive. For example, Britton et al. (1972) found that people responded
faster to tones while reading complex passages, and more slowly to tones with simple passages. The 
explanation posed was that simple passages used cognitive capacity to a greater extent. It may also be 
possible, however, that when faced with complex passages, people "borrowed" attentional resources from 
their auditory channel for semantic processing. Another explanation is that secondary reaction times
measure arousal. Determining which of these effects occurs, however, would require measuring attention in
both channels.

Although several studies have investigated secondary task reaction times to multiple-channel
presentations, at the time this study began, none had explicitly examined modality-specific selective 
attentional effects (Grimes, 1990a). Inspection of these results, however, can provide insights into whether 
selective attention effects may be occurring. Further analysis suggests that although there may be a general 
attention factor, there also appears to be evidence of selective attention. These data can be interpreted as 
demonstrating that secondary reaction times respond as a measure of modality-specific attentional resources. 
These studies will be reviewed here briefly. 

A series of experiments was conducted by Reeves, Thorson & Schleuder. The first of these reports 
(Reeves, Thorson & Schleuder, 1985) showed that multi-channel presentations resulted in significantly slower 
secondary reaction times to cues than single-channel presentations. When viewers were required to split 
their attentional resources among two modalities, they had less attentional capacity available. Therefore, 
they took longer to respond to secondary task cues. It appears that dual-channel processing depletes 
resources more than a single-channel. This finding is consistent with either a selective attention filter or a 
common pool of shared resources. 
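
The logic of the secondary task measure can be sketched as a difference in mean reaction times. This is a minimal illustration, not the analysis used in the studies reviewed here: the function name `secondary_task_cost` and the reaction times below are invented for the example and are not data from any study.

```python
from statistics import mean


def secondary_task_cost(baseline_rts, condition_rts):
    """Return the mean slowing (in ms) of responses to a secondary cue
    in a condition relative to a baseline. A positive value is read as
    fewer resources being left over for the cue."""
    return mean(condition_rts) - mean(baseline_rts)


# Hypothetical reaction times in milliseconds (illustrative only).
single_channel_rts = [310, 325, 298]
dual_channel_rts = [352, 371, 340]

cost = secondary_task_cost(single_channel_rts, dual_channel_rts)
print(f"Dual-channel slowing: {cost:.1f} ms")
```

A slowing of this kind does not by itself separate the two accounts, which is why the cue modality also has to be varied.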

The second published study (Thorson, Reeves & Schleuder, 1985) consisted of three experiments. 
The first experiment looked at how auditory message complexity affected secondary reaction times to an
audio tone and memory. When they only listened, subjects responded more quickly while listening to simple
messages than complex ones. This suggests that the resources available to cope with the audio cue were
decreased by complex audio. When they only watched, the effect was in the opposite direction. When
subjects were presented with both channels of information, however, there was no difference in responding to 
simple or complex audio messages. These results appear compatible with modality-specific resources. 
Specifically, viewers may have been able to shift attentional resources from unused modalities - for example, 
while listening, from the visual to the auditory modality. 

The third experiment in this report (Thorson et al., 1985) used a visual cue - a strobe light mounted
behind the subject. Visual complexity did not affect attention levels. The strobe light cue, however, may
have caused a startle reaction on the part of subjects (Reeves, personal communication, 1991). This may
have been due to the light's intensity, the nature of the cue, or its spatial location. Startling subjects
could have washed out possible effects. The comparison of secondary reaction times across cue channels
supports this interpretation - reaction time to the visual flash remained near baseline. The results for video
complexity were in the direction that would be predicted. Subjects in the video-only condition tended to
respond more slowly to visually complex messages than to visually simple messages. While the researchers
noted that "modality of the secondary task interacts with channel condition" (p. 448), they did not suggest 
how modality differences might have altered the observed secondary reaction times. If the results are 
interpreted with respect to modality-specific resources, they are compatible with modality-specific attentional 
effects. It appears that information consumes attentional resources specific to its particular modality. Visual 
information uses visual resources, and auditory information uses auditory resources. 

The data from the Thorson et al. (1985) study were reanalyzed according to micro (local) and larger 
(global) measures of complexity (Thorson, Reeves & Schleuder, 1987). An interesting conclusion suggests 
that the video modality may not be as limited as the audio modality; that is, video processing may require 
fewer resources than audio processing. Unfortunately, this research confounded the modality with the nature 
of the content. Specifically, audio complexity was operationalized as a "count of propositions" (1985: 434) 
and video complexity as scenes that "contained many edits, scene changes [etc.]" (p. 434). In this case, audio 
complexity, by the nature of the operationalization, measured semantic complexity while video complexity 
measured non-semantic complexity. Therefore, we cannot be sure if the observed differences are due to the
modality or the nature of the information. Some forms of information require more effortful processing than
other forms (Triesman, 1988). Semantic information may require more effort than non-semantic
information. We know that audio semantic information taxes resources. We do not know, however, whether
it is the semantic nature of the information or its auditory presentation that taxes resources. The differences
between the auditory and visual modalities should be investigated further. Such an investigation would cross
the nature of the information (semantic, non-semantic) with the modality in which the information is transmitted.
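
Crossing two factors means testing every combination of their levels, so that neither factor is confounded with the other. The sketch below enumerates the cells of such a design. The factor labels come from the text, while the names `INFORMATION_TYPES`, `MODALITIES`, and `crossed_design` are assumptions made for illustration.

```python
from itertools import product

# Factor levels named in the text; the variable names are illustrative.
INFORMATION_TYPES = ("semantic", "non-semantic")
MODALITIES = ("audio", "video")


def crossed_design(*factors):
    """Return every combination of factor levels (a fully crossed design)."""
    return list(product(*factors))


cells = crossed_design(INFORMATION_TYPES, MODALITIES)
for information_type, modality in cells:
    print(f"{information_type} information presented in the {modality} channel")
```

With two levels per factor the design has four cells, which allows an effect of modality to be separated from an effect of information type.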

In the fourth report of the Reeves et al. research, Schleuder, Thorson & Reeves (1988) examined
the effects of time compression (none, 120%, and 140%) on secondary reaction times to cues in two
channels. Time compression yielded slower reaction times to auditory cues for compressed messages, but
faster reaction times to tactile cues for compressed messages. Because there was no independence between
the manipulation of video and audio complexity, it is not clear which of these two complexity differences may
affect attention levels. The results were interpreted as indicating two competing processes -
modality-specific interference combined with an increase in general arousal. They commented that
"experiments using secondary reaction time measure[s] as an index of attention should incorporate three
modalities auditory, visual, and tactile....Each modality competes for visual and auditory processing
resources differently" (Schleuder et al., 1988, p. 22). Again, this research suggests that resources may be
specific to particular modalities.

Five other studies are relevant to the examination of modality-specific resources. The first two were 
conducted by Geiger and Reeves (1989; 1991). This research examined the resource demands of television 
editing. In these studies, two types of edits were compared - semantically related and semantically
unrelated. Semantically unrelated cuts were expected to show evidence of greater resource demands. The 
results showed significantly slower reaction times to audio cues after semantically unrelated cuts. An 
alternative explanation for these results, however, is that the visual discrepancy caused viewers to shift more 
mental energy to their visual modalities. Viewers may have selectively attended to their visual modality 
instead of their auditory modality. This would have made them slower in responding to audio cues. This 
explanation cannot be ruled out with the existing data. It could, however, be investigated by crossing 
semantically related and semantically unrelated cuts with the modality of secondary task cues. 

Grimes (1990a) conducted a study of the effects of audio-video correspondence (redundancy) on secondary 
reaction times and memory for news stories. In this study, responses to auditory (10,000 hertz tones) and visual (color 
bars) cues followed a similar pattern. High and low correspondence between the audio and video channels resulted 
in slower reaction times than moderate levels of correspondence. This was significant only for the visual 
probes, however. Grimes proposed that the relationship between secondary reaction times and memory 
demonstrates competition between these two modalities. That is, he proposed that these results suggest that 
each modality competes for the same attentional resources. 

Basil & Melwani (1991) conducted a secondary analysis of reaction times to secondary cues. These 
results show that reaction times to the auditory cues were slowed by the presence of people on the screen. 
Interactions also appeared in that these effects were lessened when music was present. While people's 
reactions were slowed by the presence of particular visual information, their reactions were quickened by the 
presence of other audio information. This finding is compatible with the notion of resource shifts — namely, 
that viewers shifted resources to channels with more information. Again, this result is compatible with 
selective attention effects and the possibility of a common pool of resources that is shifted between the 
auditory and visual modalities. 

Subsequent to undertaking this study, Grimes (1991) examined modality-specific attention to 
television scenes. He examined the effect of varying levels of cross-channel redundancy in television news 
stories on attentional resources. The results suggest that there may indeed be a common pool of resources 
that is shifted among modalities. Specifically, while watching audio-based television news stories, subjects 
appear to have shifted additional attentional resources to their auditory modality. This study only examined 
attention effects, however; therefore, we have no information on whether processing resources or memory 
differences are also modality specific. 
Implications for research 

Research that uses memory measures to assess cognitive resource limitations is inconclusive. A 
variety of factors affect memory independent of resource limitations. In addition, resource limitations may 
be different at various stages of processing. When attention is examined through secondary reaction task 
times, however, the results do not suggest an all-or-nothing filter. Instead, they appear to support theories of 
a common pool of resources that are shared between modalities. These results suggest that we should 
measure effects at multiple stages of processing. Studying these outcomes simultaneously would allow us to 
identify the nature and location of these limitations in processing multiple channels of information. 

One approach to the question of resource limitations has been used in psychology, but has not been 
explored in communication. It involves presenting viewers with both channels of information, but asking 
them to attend selectively, or focus, on a particular channel (Schneider et al., 1984). This manipulation could 
be used to examine whether the human information processing system handles single or multiple channels. 
If people can attend to only a single modality, they would miss information in the other modality. If, 
however, people process multiple modalities simultaneously, then they should still glean information in the 
unfocused channel. This investigation can measure the effects of selective attention at two distinct stages. 
First, we can examine the possibility of modality-specific attention. Second, we can examine the possibility of 
modality-specific memory differences. 

Manipulating viewers' focus on a particular channel of information could result in three specific 
outcomes. First, according to structural models, if viewers can only attend to one modality at a time, then 
they will only be able to detect and remember information in that channel. They will miss information in the 
unfocused channel. Second, according to a resource model, if television viewing uses modality-specific 
resources, then focusing on one channel will increase the attentional resources available to that modality at 
the expense of the other modality. Viewers will not miss the other information, but will be less able to 
detect flashes or remember information in the unfocused modality. Third, according to the multiple stage 
model, the location of these resource limitations may not be at the attention stage, but at the semantic 
processing stage. If this is the case, then allocating more resources to one channel will not enhance attention 
to that channel but will enhance memory for information in that channel. Viewers will be able to detect 
information in both channels, but have better memory for the focused channel. 
Hypotheses 

Because television contains both auditory and visual information and people have limited information 
processing abilities, it is expected that viewers make use of selective attention to focus on a particular 
channel. First, viewers should shift attentional resources based on the location of the semantic information 
in a message. Viewers should pay more attention to the semantic channel than to the non-semantic channel. 
This leads to the following prediction: 

H1: When the semantic information is in the audio channel, subjects will show more attention to 
the auditory modality; however, when the semantic information is in the video channel, subjects will 
show more attention to the visual modality. 

Second, viewers should shift attentional resources according to instructions. Viewers should show 
more attention to the channel of instructed focus than to the opposite channel. This results in the following 
prediction: 

H2: When instructed to focus on the audio channel, subjects will show more attention to the 
auditory modality; however, when instructed to focus on the video channel, subjects will show more 
attention to the visual modality. 

Third, viewers should shift semantic processing resources based on the location of the semantic 
information in a message. Because of the dedication of additional resources, a semantic channel effect 
should result in better memory for information in the semantic channel than for information in the 
non-semantic channel. This leads to the following prediction: 

H3: When the semantic information is in the audio channel, subjects will show better memory for 
auditory information; however, when the semantic information is in the video channel, subjects will 
show better memory for visual information. 

Fourth, viewers should shift semantic processing resources according to instructions. This instructed 
focus effect should cause better memory for information in the channel of instructed focus than for 
information in the opposite channel. This leads to the following prediction: 

H4: When subjects are instructed to focus on the audio channel, they will show better memory for 
auditory information; however, when subjects are instructed to focus on the video channel, they will 
show better memory for visual information. 



This experiment used a two-by-two, fully-crossed, within-subjects design to investigate 
modality-specific attention and memory. Two independent variables were used to create selective attention 
to a particular channel. The first independent variable was the channel containing the semantic content (audio 
or video). The second independent variable was the channel on which viewers were instructed to focus (audio 
or video). The first dependent variable, attention, was investigated in two modalities (auditory and visual) by 
measuring viewers' secondary reaction time to modality-specific cues (audio tones and color flashes). The 
second dependent variable, memory, was investigated by asking paper-and-pencil multiple-choice questions 
about channel-specific content. 
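
The crossing described above can be enumerated explicitly. The following is a minimal sketch, not part of the original procedure; the variable names and string labels are illustrative assumptions. It lists the reaction-time cells formed by crossing the two manipulations with the modality of the secondary task cue.

```python
from itertools import product

# Labels are illustrative assumptions; the paper names the factors but not these identifiers.
SEMANTIC_CHANNEL = ("audio", "video")   # channel carrying the semantic content
INSTRUCTED_FOCUS = ("audio", "video")   # channel viewers were told to concentrate on
CUE_MODALITY = ("auditory", "visual")   # modality of the secondary task cue

def design_cells():
    """Return every combination of the two manipulations and the cue modality."""
    return list(product(SEMANTIC_CHANNEL, INSTRUCTED_FOCUS, CUE_MODALITY))

cells = design_cells()
for semantic, focus, cue in cells:
    print(f"semantic={semantic:5s}  focus={focus:5s}  cue={cue}")
```

Because the design is within-subjects, each subject would contribute reaction times to every one of these cells.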

To ensure that most of the variance in secondary reaction task times and memory was due to 
experimental differences, with minimal noise from external factors, this research took place in a controlled 
laboratory setting. Subjects were instructed to "watch the television with your full attention." The necessity of 
responding to the secondary task cues also ensured a high level of attention. Secondary reaction times for 
messages were assessed while viewing. Memory was measured after viewing. 
Operationalizations 
Selective attention 

One of the most important aspects of information processing and attention discussed so far is 
selectivity - what is "attended to" (Broadbent, 1958). Selective attention to specific channels of television 
information, however, probably depends on both the contents of the message and the desires of the viewer 
(Anderson & Lorch, 1983; Collins, 1982; Geiger, 1988; Watt, 1979). Either of these two factors may 
determine whether viewers focus on a particular channel of television information (Salomon, 1972, 1974). 
Both of these operationalizations are explained below. 

Semantic channel 

Channel focus can be an attribute of a message. Viewers may focus on a particular channel based 
on what is in the message. One message factor that may affect channel focus is the location of the plot 
information or meaning. If this information is in a particular channel, viewers may be more likely to devote 
more effort to that channel (Treisman, 1964, 1968). Semantic or plot information can result in a focus on 
that channel because of its importance to the viewer (Collins, 1982; Salomon, 1979). 

In television scenes, semantic information can be carried in either the audio or video channel. This 
can be done by finding television scenes where the semantic information is in a particular channel. Specifically, 
there are instances in which either the audio or video channel contains the bulk of the semantic information. 
Documentaries, for example, often contain the semantic information in a narrative audio track complemented 
by visual images. Chase scenes, however, use visuals to carry the story, and are accompanied by sound effects 
in the audio channel. This study operationalized scenes as containing semantic information in one of the 
two channels. 

The semantic channel was identified in the following way. Messages were selected that had a 
dominant semantic channel. This selection was based on finding scenes that were comprehensible by only 
hearing them and scenes that were only comprehensible by seeing them. Scenes that were only 
comprehensible with the audio channel were "audio-semantic." Scenes that only conveyed the story through 
video were "video-semantic." In this way, scenes contained semantic information in either the audio or video 
track. The semantic channel was a categorical variable - audio or video. 

Instructed focus 

Channel focus can also be an attribute of a viewer (Collins, 1982; Salomon, 1979, 1983). For 
example, the desire to focus on a particular channel may depend on viewer interest. This contention is based 
on evidence that viewers can switch between particular channels of information. One demonstration of this 
effect can be seen in the "figure-ground" effect (Kahneman, 1973). For example, an ambiguous drawing can 
be seen as a vase or a face at various points. It cannot, however, be seen as a vase and a face 
simultaneously. Although selective attention does not completely exclude information in other modalities, 
information in the secondary modality appears to be processed less completely. For this study, it was 
important to control for and examine the effects of selective attention. This experiment, therefore, 
manipulated viewers' focus directly. Receiver focus, then, is operationalized as a categorical variable - 
audio-focused or video-focused. 

Viewers' focus was manipulated by asking viewers to attend to the audio or video channel. This 
manipulation has been used previously (Dornbush, 1970; Katz, Adoni & Parness, 1977). The experimenter 
asked the subject either to "concentrate on the audio material - the words and sounds" or to "concentrate 
on the video material - the pictures." 
Attention 

Attentional resources are widely believed to be limited (Kahneman, 1973). Through these 
limitations, a person's performance of a task reduces the amount of resources remaining for other tasks 
being performed concurrently (Kahneman, 1973). As more effort is devoted to a primary task, less is 
available for the second task. This study used secondary reaction times to measure attention. While people 
are engaged in the primary task of watching television, they are timed on an occasional secondary task. 
Secondary reaction time is the interval that elapses between a cue and the person's response (Geiger & 
Reeves, 1989, 1991; Reeves & Thorson, 1986; Reeves et al., 1991; Thorson et al., 1986). The latency to 
respond is compared across a sampling of different types of television material. Reaction times to visual 
cues and auditory cues were measured for each scene. These measures were then averaged to represent 
means for particular conditions. Secondary reaction time, then, is a ratio-level variable that can be seen as 
modality-specific. Reaction times were a continuous interval measure compared across different television 
scenes. 

Secondary reaction times were assessed in both the auditory and visual modalities. This was done 
through the use of both auditory and visual cues (Grimes, 1990; Schleuder et al., 1988; Thorson et al., 1986). 
The auditory cue was a 1000-Hertz tone lasting for 100 milliseconds. It was fired by computer. Tones were 
played through the television monitor at comfortable listening levels. In previous research, visual cues have consisted of both 
strobe flashes (Thorson et al., 1986) and color flashes on the screen (Grimes, 1991). In this study, the visual cue was an 
orange flash on the screen. It consisted of four frames of solid color edited onto the videotape (lasting for 
133 milliseconds). Response latency to these cues was measured by computer. 
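
The stated duration of the visual cue follows from the video frame rate. The sketch below is a worked check under an assumption the text does not state: that the videotape ran at the NTSC rate of roughly 29.97 frames per second. The function name is hypothetical.

```python
# Assumption: NTSC video at 30000/1001 (about 29.97) frames per second.
# The paper reports the cue length in frames and milliseconds but not the frame rate.
NTSC_FPS = 30000 / 1001

def frames_to_ms(frames, fps=NTSC_FPS):
    """Convert a number of video frames to a duration in milliseconds."""
    return frames / fps * 1000.0

four_frame_ms = frames_to_ms(4)    # the visual cue used in the experiment
three_frame_ms = frames_to_ms(3)   # a one-frame-shorter cue, for comparison
print(round(four_frame_ms), round(three_frame_ms))
```

Under this assumption a four-frame flash lasts about 133 milliseconds, which matches the reported value, and each frame added or removed changes the cue by about 33 milliseconds.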
Memory 

This study used the recognition of information as a measure of memory. The ability to recognize 
information from memory, of course, rests on the parsing and storage of that information and the ability to 
retrieve it from memory (Kellermann, 1985). For this research, memory was measured as recognition 
accuracy for audio and video components of scenes (Grimes, 1990). Although memory is conceptualized as a 
continuum (from none to complete), the measures are probabilistic samples of this continuum. Memory, 
then, is a ratio-level accuracy measure compared within subjects but across television scenes. This study 
examined cued recall in the form of multiple-choice questions. Questions asked about information specific to 
a particular channel. Visual questions, for example, asked about what happened, and how people were 
dressed. Auditory questions asked about what was said and the background music. 
Unit of Analysis 

There is no intrinsic "container size" for theories about information processing (Reeves, 1989). 
Previous research, however, suggests that people "chunk" information into meaningful bits (Carter, Ruggels, 
Jackson & Heffner, 1977). According to these and other results, people can chunk thirty-second segments 
such as advertisements as psychological "units." 

In this study, the unit of analysis was a thirty-second "scene." This unit of analysis was encouraged 
by the use of discrete segments of programs lasting 28 to 33 seconds. Each was separated by 5 seconds of 
black. The selective attention manipulations caused the semantic channel and the instructed channel focus to 
vary between scenes. Attention was averaged over 30-second scenes to represent attention levels for scenes. 
Memory was also averaged to represent recognition levels for scenes. Thus, thirty-second scenes were 
meaningful units for theories about message factors such as the contents of a channel, and viewer factors 
such as channel focus, level of attention, and memory. 
Subjects 

Twenty-four summer school students at a large Western University's Mass Media Institute were 
recruited to take part. They had come from various locations around the United States for special summer 
school courses on the mass media. All participation was voluntary. The entire procedure took less than 
one-half hour (approximately 25 minutes). Subjects ranged from 16 to 47 years of age. Twelve were women 
and twelve were men. 
Stimuli 

A variety of television scenes that used either the audio or video channel to carry the semantic 
information were sampled. Sampling allowed us to ensure a variety and range of naturally-occurring 
messages and correlated factors (Jackson & Jacobs, 1983; Jackson, O'Keefe & Jacobs, 1988; Morley, 1988a, 
1988b; Reeves & Geiger, in press). This variety of messages included a range of genres (Levy, 1989; 
McLeod & Pan, 1989; Reeves, 1989). To select the stimuli for this study, the following steps were taken. 
First, six genres were selected.1 Three of these - news, interviews, and documentaries - represented 
generally audio-based genres. The other three - animation, action, and crime - represented video-based 
genres. Programs were recorded from actual television broadcasts. These programs were then viewed for 
content. When a scene was incomprehensible without one channel of information, it was selected for 
pretesting. Two raters verified these ratings. The first rater listened to the scenes and tried to identify what 
the message was about. A second rater both watched and listened to the messages and tried to identify 
"whether the audio or visuals are most important to conveying the story." Both raters were used for 
classification of the experimental scenes. A list of these scenes is presented in Appendix A. Two alternate 
orders of the stimulus tape were made. These tape orders are shown in Appendix B. 

1 This sample was based on a survey of 10 Ph.D. students who rated 18 genres on their information, action, 
and emotion levels. The genres rated as high on information were news, interviews, and 
documentaries. The genres rated as high on action were animation, action, and crime. All six genres were 
equivalent on ratings of emotion level. 

Previous research has shown that audio and video complexity may affect secondary reaction times 
(Reeves et al., 1985; Thorson et al., 1985, 1986). Therefore, stimuli were selected which used a range of 
audio and video complexity. These measures can be seen in Appendix B. 

Location of cues. Previous research has discovered that both local and global complexities can affect 
reaction times to secondary cues (Thorson et al., 1986). Local complexity refers to what is occurring in the 
scene at that particular moment. Global complexity refers to what is happening more generally. This 
research concentrated on the global complexity dimension for two reasons. First, scenes were the unit of 
analysis. Second, it would be preferable to avoid smaller factors that might add noise to this level. 

For these reasons, each secondary task cue was placed according to four 
criteria. First, one cue was placed in the first 15 seconds and the second in the last 15 seconds. Second, to 
avoid the effects of production factors such as cuts (Geiger & Reeves, 1989, 1991; Kim, Hawkins & Pingree, 
1991), cues were not placed within three seconds of a cut. Third, cues were placed at natural breaks or 
pauses in the audio channel. Fourth, whether the first cue was auditory or visual was based on a random 
number table. An alternative version of the tape was made which used the opposite order. To ensure that 
subjects could not anticipate the occurrence of secondary task cues, seven experimental scenes contained a 
third cue of random modality. The sequences of secondary task cues can be seen in Appendix B. 
Procedure 

Subjects were run individually. Each subject was welcomed to the lab, the experimenter introduced 
himself, and the subjects were seated in front of the television. They were given the general instructions and 
shown a short practice tape to acquaint them with the secondary task tones and flashes as well as the 
procedure. When they had become proficient in responding to both tones and flashes, the practice material 
was stopped. The subjects were asked if they had any questions or problems. 

Next, subjects were given the first manipulation. They were asked to concentrate on the video or 
audio material, and told what type of questions they would be asked for information in that channel. 
Subjects were not told that they would be asked about information from the non-focused channel. Subjects 
then watched eleven scenes. The first two scenes provided practice for the focus manipulation and were not 
included in later analyses. This segment lasted for approximately 7 minutes. (The orders are shown in 
Appendix B.) The experimenter left the room. He returned at the end of the sequence and asked how it 
had gone and whether there were any problems or questions. The other manipulation was then given (to 
concentrate on the audio or video material, and what type of questions would be asked). Subjects then 
watched nine more scenes that lasted for approximately 6 minutes. The experimenter then left the room. 
He returned at the end of the sequence. The subjects were then given the memory questionnaire. This 
questionnaire contained 96 multiple-choice questions to test recognition accuracy for all segments. When they 
finished the questionnaire, subjects were debriefed, asked if they had any questions, thanked, and shown out. 
Analysis 

The effects of semantic channel and viewers' instructed focus on secondary reaction time were 
investigated. These effects were investigated in both modalities. For the first step, data were plotted. 
Non-responses and outliers that were more than 4 standard deviations from the mean were removed. This 
deleted 31 of 1320, or 3% of the cases. 
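
The trimming rule described above can be sketched as follows. This is an illustration only: the text does not say whether the mean and standard deviation were computed over all responses or within conditions, so the sketch assumes a single pooled calculation. The function name and the sample data are hypothetical, and a non-response is represented as None.

```python
from statistics import mean, stdev

def trim_reaction_times(rts, n_sd=4.0):
    """Drop non-responses (None) and values more than n_sd SDs from the mean.

    Assumption: the mean and SD are pooled over all recorded responses.
    """
    responded = [rt for rt in rts if rt is not None]
    m = mean(responded)
    sd = stdev(responded)
    return [rt for rt in responded if abs(rt - m) <= n_sd * sd]

# Hypothetical data: 40 ordinary latencies, one extreme latency, one non-response.
raw = [390, 410] * 20 + [5000, None]
cleaned = trim_reaction_times(raw)
print(len(raw), len(cleaned))
```

With a cutoff as wide as 4 standard deviations, only extreme latencies are removed, which is consistent with the small share of cases the procedure deleted.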

The effects of semantic channel and viewers' instructed focus on memory were also investigated. 
Data were subjected to statistical analysis with a within-subjects analysis of variance procedure. The .05 level 
of significance was used for all comparisons. 
Results 

Manipulation check 

Four self-report measures were used as manipulation checks for the instructed focus. In the first, 
viewers were asked, "When I told you to focus on the video material - the pictures, were you able to?" 
Twenty-one of the twenty-four subjects (88%) indicated that they were able to "focus" on the video material 
as instructed. All twenty-four were used in the analyses. For the second manipulation check, subjects were 
asked, "How easy was it to focus on the video material?" They were provided a 1-to-7 scale labelled "very 
easy" to "very hard." The average response was 2.4, nearer the "easy" end. For the third manipulation check, 
subjects were asked, "When I told you to focus on the audio material - the words and sounds, were you able 
to?" Twenty-three (96%) indicated that they were able to focus on the audio material as instructed. For the 
fourth manipulation check, subjects were asked, "How easy was it to focus on the audio material?" They were 
provided a 1-to-7 scale labelled "very easy" to "very hard." The average response was 3.0, nearer the middle 
of the scale than to the "easy" end. 

All four self-report measures suggested that viewers were able to focus on particular message channels. 
Viewers also reported that it was fairly easy for them to focus on a particular message channel. These 
results suggest that the viewers' focus was successfully manipulated. 

The results, however, indicate that it may have been easier for subjects to focus on the video channel 
than the audio channel. A paired t-test examined subjects' reports of the difficulty of focusing. Subjects rated 
focusing on the video channel (M = 2.4) as significantly easier than 
focusing on the audio channel (M = 3.0) (t[23] = 2.17, p < .05). 
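
For readers unfamiliar with the paired t-test, the statistic is the mean of the within-subject differences divided by its standard error, with n - 1 degrees of freedom. The sketch below shows the calculation on hypothetical ratings; the individual ratings from this study are not reported, so the numbers here are invented for illustration and do not reproduce the result above.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(x, y):
    """Return (t, df) for a paired t-test on two equal-length lists of scores."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Hypothetical 1-to-7 difficulty ratings from six subjects (not the study's data).
audio_difficulty = [3, 4, 2, 5, 3, 4]
video_difficulty = [2, 3, 2, 3, 3, 2]
t_value, df = paired_t(audio_difficulty, video_difficulty)
print(f"t[{df}] = {t_value:.2f}")
```

With twenty-four subjects the test has 23 degrees of freedom, as in the result reported above.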
Secondary reaction time 

Secondary reaction times were analyzed with a repeated-measures analysis of variance. No 
significant disordinal interactions were found. Before examining the hypotheses, three main effects need to 
be discussed. 

Subjects responded more quickly to visual cues (M = 399 msec) than to auditory cues (M = 421 msec) 
(F[1,1090] = 24.4, p < .001). This difference in the speed of responding to these two cues does not reflect 
more attention to the visual modality than to the auditory modality. Instead, it is due to detection levels. It 
was easier to detect visual cues than auditory ones. As a result, people responded faster to these visual cues than 
to these auditory cues.2 Further analyses, therefore, compared reaction times to each type of cue separately. 

Hypothesis 1 predicted that subjects would be faster to audio cues in audio-based scenes and video 
cues in video-based scenes than to opposite-modality cues. This was not found (F < 1). Instead, subjects 
responded more quickly to both auditory and visual cues in audio-semantic scenes than in video-based scenes 
(F[1,1090] = 9.49, p < .01). The average secondary reaction time to auditory cues in audio-based scenes was 
415 msec, and in video-based scenes was 428 msec (p < .05 by Tukey procedure). The average secondary 
reaction time to visual cues was 391 msec in audio-based scenes and 407 msec in video-based scenes 
(p < .05). This result can be seen in Figure 4. As predicted, people were faster at detecting cues in the 
auditory modality when the semantic information was in the audio channel. However, they were also faster at 
detecting cues in the visual modality when the semantic information was in the audio channel. The results for 
the semantic channel manipulation conflict with Hypothesis 1. 

2 Evidence for this assertion can be seen in comparing the second pretest of the experiment with the final 
results. In the pretest, responses to the 3-frame visual cue were slightly faster than responses to the auditory 
cue. When the cue was lengthened to four frames, subjects became faster in responding to visual cues. In both 
cases, however, subjects were faster for audio-based scenes than for video-based scenes. Thus, the overall effect 
remained the same, even though the main effect for baseline reactions seems to depend on the specifics of the 
visual cue that was used. 

Hypothesis 2 predicted that subjects would be faster to detect cues in the auditory modality when 
focused on the audio channel and in the visual modality when focused on the video channel. This was not 
found (F < 1). The result, shown in Figure 5, also conflicts with the predicted effect for instructed focus. 
Subjects were faster in detecting cues when focused on the audio channel than when focused on the video 
channel (F[1,1090] = 4.41, p < .05). The average secondary response time to auditory cues when focused on 
the audio channel was 416 msec, and when focused on the video channel was 425 msec (p < .05). The average 
secondary response time to visual cues when focused on the audio channel was 396 msec and when focused 
on the video channel was 406 msec (p < .05). 
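
The pattern in these cell means can be made concrete with a simple contrast calculation. Hypotheses 1 and 2 predicted a crossover, in which the advantage of one condition reverses between cue modalities; what the means show instead is a difference in the same direction for both cue types. The sketch below works only from the cell means reported above and is not a significance test. It assumes that the 407 msec mean for visual cues belongs to video-based scenes, and the function and key names are illustrative.

```python
def summarize(cells):
    """Return (main_effect, interaction) in msec from four cell means.

    cells maps (condition, cue) to a mean reaction time, where condition is
    "audio" or "video" (semantic channel or instructed focus) and cue is
    "auditory" or "visual".
    """
    aud_diff = cells[("video", "auditory")] - cells[("audio", "auditory")]
    vis_diff = cells[("video", "visual")] - cells[("audio", "visual")]
    main_effect = (aud_diff + vis_diff) / 2   # average slowing in the video condition
    interaction = aud_diff - vis_diff         # a crossover would make this large
    return main_effect, interaction

semantic_channel = {
    ("audio", "auditory"): 415, ("video", "auditory"): 428,
    ("audio", "visual"): 391, ("video", "visual"): 407,  # 407 assumed to be video-based scenes
}
instructed_focus = {
    ("audio", "auditory"): 416, ("video", "auditory"): 425,
    ("audio", "visual"): 396, ("video", "visual"): 406,
}
print(summarize(semantic_channel))
print(summarize(instructed_focus))
```

For both manipulations the average difference favors the audio condition by roughly 10 to 15 msec, while the interaction contrast is only a few milliseconds, which is in line with the nonsignificant interactions reported above.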

The results across these two variables were surprisingly similar. The variables, however, were 
completely independent. The semantic channel variable, for example, alternated randomly among the 24 
scenes. This can be seen in Appendix B. Meanwhile, instructed focus was manipulated only twice - one 
channel for the first 12 scenes, and the opposite focus for the last 12 scenes. The location of these two 
manipulations is also shown in Appendix B. Therefore, the variables plotted in Figure 4 are completely 
uncorrelated with each of the manipulations plotted in Figure 5. 






These results suggest that regardless of the selective attention variable examined - semantic channel 
or instructed focus - subjects were faster at detecting secondary task cues when these variables attempted to 
focus them on the audio channel of television material.3 

Memory 

Memory measures were analyzed with two separate repeated-measures analyses of variance. Two 
specific hypotheses were investigated. 

Hypothesis 3 predicted that memory would be better for audio material in audio-based scenes and 
for video material in video-based scenes. This was not found (F<1). Instead, subjects showed better 
memory for visual questions, as can be seen in Figure 6 (F[1,23] = 34.6, p < .001). 
Questions about visual information were easier for subjects. In addition, subjects showed better memory for 
both types of material when semantic information was in the video channel (F[1,23] = 16.4, p < .001). Subjects 
correctly identified an average of 18.7 video questions when the semantic information was in the audio 
channel versus 20.9 when the semantic information was in the video channel (p<.05). Similarly, subjects 
correctly identified 15.7 audio questions when the semantic information was in the audio channel, but 18.0 
when the semantic information was in the video channel (p<.05). 



3 Further evidence can be seen in both pretests of this experiment. In the first pretest, responses to auditory 
secondary-task cues were faster for audio-based messages than for video-based messages. In the second pretest, 
responses were faster for the audio-based messages for both cue modalities. Both pretest results, then, are 
consistent with the direction found in the final experiment. 

Hypothesis 4 also predicted that subjects would show better memory for the material on which they 
were instructed to focus. They were expected to show better memory for audio material when focused on 
the audio channel, and for video material when focused on the video channel. This effect was significant 
(F[1,23] = 6.6, p < .02). It is illustrated in Figure 7. 
Along with this result, there was also an effect for modality of questions - visual questions being easier than 
auditory questions (F[1,23] = 35.9, p < .001). In addition, an effect for instructed focus emerged. Subjects 
showed better memory when they were instructed to focus on the audio channel (F[1,23] = 10.2, p < .001). 
This effect should be interpreted in view of the significant modality of material-by-instructed focus 
interaction. Little effect was seen on the video accuracy. Specifically, subjects correctly identified 19.5 video 
questions when focused on the audio channel and 19.2 when focused on the video channel (n.s.). The effect 
was seen for the audio questions, however. For the audio questions, subjects correctly identified 17.7 
questions when focused on the audio channel, but only when focused on the video channel (p < .05). 

These results suggest that memory was significantly affected by both variables. Viewers' memory
appears to depend on where the viewer is focused. For the semantic channel variable, memory was better
for audio and video information when the scenes contained most of the semantic information in the video
channel. With the instructed focus manipulation, however, memory for audio was better when subjects were
instructed to focus on the audio channel. 4

Discussion 

This experiment demonstrates that selectively focusing on a particular channel of television has 
several significant effects on measures of attention and memory. The results indicate differences in the 
nature of audio-based and video-based messages and the processing of auditory and visual information. Each 
of the hypotheses, and the implications of the results, will be discussed in turn.

Reaction time

Hypothesis 1 predicted that, because of selective attention, people would demonstrate more attention 
to the channel that contained the semantic information. Subjects were expected to respond more quickly to 
auditory cues when that channel contained the semantic information and to visual cues when that channel 
contained the semantic information. This was not found. Instead, the results show that under these
laboratory conditions, subjects had faster reactions to secondary task cues when the semantic information was
in the audio channel.

Hypothesis 2 predicted that people would show more attention to the channel in which they were 
instructed to focus. That is, when subjects were focused on the audio channel they were expected to respond 
faster to audio cues, and when they were focused on the video channel they were expected to respond faster 
to video cues. This was not the case. Instead, viewers were able to detect cues in both channels, regardless
of the channel on which they were instructed to focus. Again, subjects had faster reactions to both types of
secondary task cues when they were instructed to focus on the audio channel. 

The attention results have two important implications. First, because viewers were not any faster at
detecting cues in the semantic or instructed focus modality, it suggests that attention is not modality-specific.
Instead, people appear to monitor both channels of television for salient events and cues. Viewers'
perceptual system does not appear to be limited to a single channel. All modalities appear to gain some
admittance into the processing system.

4 Further analyses were conducted. These analyses investigated potential relationships between attention and
memory measures. Analyses averaged across subjects to obtain mean reaction times to particular scenes and
mean memorability scores. Small correlations were obtained. The correlation between reaction time and visual
memory was positive (.14). The correlation between reaction time and auditory memory was negative (-.13).
The reciprocal nature was intriguing. Further analyses, however, showed that the relationship was too weak to
be statistically significant.
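
The weakness of the scene-level correlations reported in footnote 4 (.14 and -.13) can be checked with the standard t test for a Pearson correlation. The sketch below is a minimal illustration, not the analysis the study ran. It assumes one data point per test scene (24 scenes, inferred from Appendix A); the function name and constants are hypothetical. Under that assumption, a correlation would need an absolute value of roughly .40 to reach the two-tailed .05 level.

```python
import math

def correlation_t(r, n):
    """t statistic for testing a Pearson correlation r from n paired observations."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

N_SCENES = 24        # assumption: one data point per test scene
T_CRITICAL = 2.074   # two-tailed .05 critical value of t with 22 degrees of freedom

for label, r in [("visual memory", 0.14), ("auditory memory", -0.13)]:
    t = correlation_t(r, N_SCENES)
    verdict = "significant" if abs(t) > T_CRITICAL else "not significant"
    print(f"{label}: r = {r:+.2f}, t({N_SCENES - 2}) = {t:.2f}, {verdict}")
```

Both t values fall well below the critical value, in line with the footnote's statement that the relationship was too weak to be statistically significant.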

Second, regardless of whether the cue is auditory or visual, responses are faster to audio-based
messages and when subjects are instructed to focus on the audio channel. The auditory and visual systems,
therefore, appear to have similar resources available at the same time. This appears to be evidence that at
the detection level resources are not shifted to specific channels. Detection of these types of cues, then, does
not benefit from a focus on that channel. These responses may be automatic, and may occur as a
pre-attentional sensory-level response (Neisser, 1967).

The results also show that these two modalities of television do not use resources equally.
Secondary reaction times were faster for audio-based scenes. Secondary reaction times were also faster
when subjects were instructed to focus on the audio channel. Because this effect occurred for both the
semantic channel variable and the instructed focus manipulation, the results suggest that this effect is quite
robust. It is widely believed that processing of the audio channel of television is more difficult than
processing the video channel. Also, viewers reported that it was more difficult to focus on the audio channel
than the video channel. Responses to secondary tasks were faster while viewers were engaged in difficult
material than when they were engaged in easy material. Faster reaction time to cues in difficult material is
consistent with previous findings. For example, Britton et al. (1982) found that subjects responded faster to audio
cues in complex material than in simple material. Thorson et al. (1985) also found that subjects responded faster to
secondary tones for complex auditory information than for simple auditory information.

The secondary reaction time results warrant interpretation. Some researchers purport that
secondary reaction times are a measure of attentional demands (e.g., Britton et al., 1982; Sperling &
Dosher, 1986; Thorson et al., 1985). These researchers assume that capacity is taxed to the point of reaching
its limitations (Norman & Bobrow, 1975). Slower reaction times, therefore, are believed to indicate more
difficult material (Reeves & Thorson, 1986; Reeves et al., 1985; Schleuder et al., 1988; Thorson et al., 1986).
Other researchers, however, use the same secondary tasks as a measure of attentional allocation (e.g.,
Kahneman, 1973). If (a) attentional limitations are not always taxed to capacity by television material, or (b)
responses to secondary cues are automatic, and not bogged down by processing loads, then it seems quite
possible that secondary reaction times measure attentional resource allocation by the viewer. As this study
proposed, these results appear to indicate that secondary reaction times measure resource allocation. This
result is consistent with theories of arousal and autonomic activation (Kahneman, 1973; Wickens, 1984). It
validates the suggestion that "the secondary task measure may...capture automatic responses associated with
arousal" (Reeves et al., 1991: 692).

The conclusion that secondary reaction times respond as a measure of attentional allocation is
consistent with quite a few results in both communication and psychology (Basil & Melwani, 1991; Britton et
al., 1972; Meadowcroft & Reeves, 1989; Mitchell & Hunt, 1989; Reeves & Thorson, 1986; Reeves et al., 1985,
1991; Schleuder et al., 1988; Thorson et al., 1987; Watt & Meadowcroft, 1990). These results suggest that
secondary reaction time measures do not measure attentional or sensory-level processing limitations, but
measure processing resources. Secondary reaction times are not slowed for difficult material - they are
faster. Secondary reaction times also do not appear to benefit from resource allocation to specific channels
through selective attention. The interpretation appears to be that more resources are available for difficult
material. Additional resources appear to be allocated when necessary. Attention, then, does not appear to be
operating at resource-limited levels (Norman & Bobrow, 1975) for television viewing.

Another potential explanation for this effect warrants mention. This explanation is based on
psychobiological theories of evolution (Posner et al., 1976). Posner et al. proposed that the precedence of visual
processing makes sense phylogenetically. They proposed that the brain is biased toward the reception of
visual stimuli. Although these results may go beyond that theory, they appear to conflict with it. When
subjects were focused on audio information, they responded more quickly to both visual and auditory cues.
Posner might have predicted the opposite result.

From a survival-of-the-species perspective, however, it may make sense that people respond more
quickly to changes in the environment when focused on auditory stimuli. When they are not monitoring the
visual environment for signs of danger, the nervous system may compensate by shifting additional resources
to the attentional system. The danger detection system may involve both channels of attention. So when people
are not looking, the autonomic nervous system may be enhanced in a general sense to compensate.

Memory

The memory results show evidence of modality-specific effects. Most importantly, there appears to
be evidence of resource limitations in the auditory modality. These conclusions will be discussed in terms of
the original hypotheses.

Hypothesis 3 predicted a semantic channel effect such that viewers would have better memory for
information in the modality that contained most of the semantic information. This hypothesis was not borne
out. Instead, viewers had better memory for both audio and video information when the semantic
information was in the video channel.

Hypothesis 4 predicted that viewers would show better memory for material in the channel on which 
they were instructed to focus. This hypothesis was partially borne out. The effect occurred over a 
background of better memory for the video questions. Viewers had, however, better memory for audio 
information after they were instructed to focus on the audio channel. 

The results for the two memory measures show limited evidence of modality-based differences.
Subjects were more accurate in responding to questions about audio and video content in visual-based
scenes. This result probably indicates that video-based television scenes such as chase scenes are easier than
audio-based scenes such as news stories. This result is also consistent with research that suggests that video
information processing requires fewer resources (or less effort) than auditory information (Colavita, 1974;
Schleuder et al., 1988). This may be due to the different form of the code systems (Salomon, 1979).
Video-based codes may be easier than audio-based codes.

When viewers were focused on the audio channel, however, memory for audio information
was better. This finding was in accord with Pezdek and Stevens (1984). Four potential theories have been
posed so far. First, there was the possibility of structural filters. Focusing on a specific modality may 
determine what is accepted for central processing. The results, however, are not consistent with this theory. 
Specifically, viewers were able to remember auditory information when they were focused on the video 
channel. 

The second potential theory was general resource limitations. Processing semantic auditory
information may require considerable allocation of either sensory-level or semantic-level resources. In this
way, attending to the audio channel may lead to more effort overall - more overall resource allocation.
When these resources are directed to the auditory channel, this may result in better memory for the items.
The results did not, however, show improved memory for visual information in these instances. Instead, the
visual information was remembered equally well regardless of the channel focus.

A third theory was that more attention may be required to remember information than to attend to
it. Perhaps attention is not as limited as are processing resources (Colavita, 1974; Posner et al., 1976). The
results are only somewhat consistent with this theory. Specifically, visual memory was neither helped nor
hindered by a visual focus. Auditory memory, however, was enhanced. This result suggests that visual
processing is not enhanced by the presence of additional resources. Such a finding is consistent with
previous research that compared visual and auditory processing. Visual processing is not the same as
auditory processing. As Kahneman (1973, p. 135) commented, "It is tempting to speculate that the modern
study of attention would have taken a different course if Broadbent (1958) had been concerned with how one
sees dances rather than how one hears messages."

A fourth theory revolves around the possibility that visuals require fewer processing extrapolations 
than audio (Cohen, 1973). This may be due to the greater immediacy of visual symbols than language-based 
auditory messages (Salomon, 1979). Visual information may access meaning more directly. If auditory 
information requires more stages of processing, it may benefit more from additional resources than would 
visual information. This could show up as better memory. 

An alternative reason for the results is that the concentration on the audio channel may have 
encouraged a different form of processing. When watching both channels of information, viewers may have 
processed the information only at a sensory level. It may have been stored to memory temporally or 
episodically. When viewers focused on the audio channel, however, they may have processed this information 
semantically. This deeper meaning-level processing may have led to better auditory memory as Craik and 
Lockhart proposed (1972). Therefore, the focus instruction may have inadvertently changed the nature of the 
processing. Although this seems unlikely, it is not possible to eliminate this explanation for these effects at
this time.

Overall

Comparing these two dependent variables yields results that are compatible with one another and
with other studies. In general, the results suggest that, at the attentional stage, resources are shared between
modalities and are not modality-specific. Memory results, however, show a benefit for audio information
when focused on the audio channel.

How is it possible that viewers were not better at detecting audio cues when they were focused on
the audio channel, but had better memory for the audio information? The general conclusion that can be
drawn is derived from two more specific conclusions.

Specific conclusion #1. First, attention is not modality-specific. Detection of information appears to
occur in both channels simultaneously, regardless of the viewers' selective attention. The evidence for this
assertion can be seen in the first two figures. In Figure 4, secondary reaction times were faster to
audio-based scenes. In Figure 5, reaction times were faster when viewers focused on the audio channel. This
suggests that responding to information at a basic level uses a common pool of resources. Because
instructions to focus on particular channels did not affect the attentional detection of secondary task cues,
this detection appears to occur automatically (Neisser, 1967; Shiffrin & Grantham, 1974).

Specific conclusion #2. The processing of visual information in television scenes is not equivalent to 
processing auditory information. The evidence for this assertion can be seen in the last two figures. In 
Figure 6, viewers show better memory for both types of information in visual-based scenes. In Figure 7, 
viewers show better memory for auditory information when they are focused on the audio channel. These
two figures show results which indicate that video information does not benefit from being in the channel of 
instructed focus to the extent that auditory information does. 

The results show that processing of auditory television material is enhanced by focusing on that
channel. The first potential explanation is that human information processors have a structural filter after
sensory processing and before semantic processing (Deutsch & Deutsch, 1963). In this scenario, only one
channel of information can "get into" the processing system and be processed semantically. Visual
information, however, shows evidence of being detected and processed equally well regardless of viewers'
focus. Specifically, visual information is detected and remembered, regardless of the instructed focus. The
enhancement of auditory memory was not at the expense of information in the other channel. Specifically,
auditory information is not remembered at the expense of visual information. This result, therefore, contradicts the
structural theory.

The second potential explanation is resource theory. Some of these theorists predict that auditory 
processing makes significantly greater resource demands than does sensory processing (Kahneman, 1973). In 
this experiment memory measures show that viewers were able to process and remember information in 
either channel. Viewers were able to remember visual information even when focused on the audio channel. 
However, there was a slight increase in audio performance when viewers focused on the audio channel. If
additional resources were dedicated to this channel, and auditory information requires more resources than 
processing visual information, the results of this experiment support a resource model of processing. 

Colavita (1974) found similar results. That is, visual stimuli take precedence over auditory stimuli. 
Visuals may require fewer extrapolations than audio (Cohen, 1973). Salomon (1979) proposed that this may 
be due to the greater immediacy of visual symbols than language-based auditory symbols. Visual information 
may access meaning more directly. If auditory information requires more stages of processing, it seems
consistent that it would require additional resources. Because auditory information requires these additional
resources, it would show a decrement before visual information does. Even with limited resources, visual
information can be processed.

These results suggest that it is not attention that is the limiting factor in comprehension of television
information. Instead, performance appears to be limited by comprehension, understanding, or
memory at some later stage. Similar to results in the field of psychology, monitoring of channels does not
tax resources to the extent that comprehension of audio semantic information does. It is the process of
understanding auditory information on television that requires resources.

Overall, these results show that television viewers process both modalities simultaneously. They
automatically detect audio and video cues - even when these cues occur along with information that is
peripheral to the plot and not their main focus. Subjects also remember details of television scenes whether
or not the information is semantically relevant and whether or not they are instructed to focus on that
channel. That is, viewers are able to remember audio and video information, regardless of the channel on
which they are focused.

The results of this experiment may not be directly generalized to natural viewing situations. They 
suggest, however, that television viewers appear to process two channels of information at once. Viewers 
glean visual and auditory information from television at the same time whether they intend to or not. This is 
consistent with results that indicate television viewers often are affected by visual and other non-verbal 
information while processing auditory information. Viewers use this information to form conclusions about 
news events and political candidates (e.g., Garramone, 1983; Grimes, 1990b). So while television users may
gain less semantic information than users of other media such as newspapers, they are also gaining
non-semantic information. Television viewers can take quite a bit of visual information away from the viewing
experience. Audio-presented information, however, benefits from a concentration or a focus on that channel.
Viewers need to be focused on the audio channel to take the most away from audio-based television news
programs. Meanwhile, visual information appears to affect viewers more directly and immediately.

The finding that audio memory is enhanced by focusing on the audio channel is consistent with 
research on news programs. News programs usually contain most of the semantic information in the audio 
channel (Grimes, 1990a, 1991). It should not be surprising, therefore, that novel or inconsistent visuals
interfere with comprehension or memory for audio information (e.g., Edwardson et al., 1985, 1991; Grimes,
1990a; Newhagen, 1990; Son et al., 1987). In these instances the novel or inconsistent visual is interfering
with the audio focus. Visuals reduce the resources available to the auditory modality, however briefly.

Finally, these results suggest that this multi-stage model and multi-measurement method can be used
to investigate the attentional allocation and memory for television programs. Because of the complexity of
television stimuli, "Our noisy, hard-to-control stimuli may actually place in high relief the versatility of the
human-information processing system" (Grimes, 1990: 25). This method is likely to be useful in determining
the nature of these processes and their limitations. These methods and measures, then, provide insights into
not only how people process television information, but about how they process information generally.
Further research can work toward determining which aspects of programs lead to arousal. We can also learn
which aspects can lead to resource limitations, and which aspects lead to better memory for audio or visual
information. In this way, the intricacies of the human information processing system can be related to
dimensions of television stimuli. This research, then, can lead to a better understanding of the processing of
not only television material, but also real-world multi-channel sources of information. In this way, this
research will lead to the study of day-to-day information processing.

Appendix A:
DESCRIPTION OF STIMULI



                                          Shots   Sentences

VIDEO-BASED

Animation
  1. Disney - Bee conducting                4         0
  2. Smurfs - Going to picnic              12         5
  3. Chipmunk girls - Faking asleep         7         4
  4. The Simpsons - Bart cheats            14         3

Adventure
  1. Lonesome Dove - Burial                 8         1
  2. Gabriel's Fire - Finding a tape        2         9
  3. Vietnam movie - Getting even           7         6
  4. Lonesome Dove - Into sunset            1         3

Crime or Detective
  1. Magnum, P.I. - Fight scene             7        12
  2. Sledge Hammer - Visions                6         4
  3. Crime Story - Searching                7         0
  4. MacGyver - Being whipped               8         6

AUDIO-BASED

News
  1. CNN - Iraqi refinery                   6         5
  2. NBC - Schwarzkopf returns              3         5
  3. CNN - El Salvador Peace talks         11         6
  4. KTVU - Homeless veterans               1         6

Talk or Interview
  1. Oprah - Disputed phone bill            2         8
  2. Psychologist - Looking for Dad         4         4
  3. Phil Donohue - Virgins                 8         7
  4. Johnny Carson - Jay Leno               4        13

Documentary
  1. Winds of Everest - Climbing            5         5
  2. History of TV - First broadcast       10         4
  3. Sinatra - The war years                9         6
  4. Jaws: The true story                   4         6

Appendix B:
TAPE ORDERS



Tape #1

Number   Type   Name               Secondary Cues

Warm-up
   1.     A.    Indiana Jones 1    A V
   2.     V.    Indiana Jones 2    V A V

Focus manipulation #1
   1.     A.    News 1             V A
   2.     A.    Talk 1             A V
   3.     A.    Documentary 1      A A V
   4.     V.    Animation 1        A V
   5.     V.    Crime 1            V V A
   6.     V.    Adventure 1        A V
   7.     A.    News 2             A V
   8.     V.    Adventure 2        V A A
   9.     A.    News 3             A V
  10.     A.    Documentary 2      V V A
  11.     V.    Crime 2            V A
  12.     V.    Crime 3            A V

Focus manipulation #2
  13.     V.    Crime 4            A V V
  14.     A.    Documentary 3      A V
  15.     V.    Animation 2        V A
  16.     A.    Talk 2             V A
  17.     V.    Animation 3        A V
  18.     A.    Talk 3             V A V
  19.     A.    Documentary 4      A V A
  20.     V.    Adventure 3        V A
  21.     A.    News 4             V A
  22.     V.    Adventure 4        A V
  23.     A.    Talk 4             A V
  24.     V.    Animation 4        A V



Tape #2

Number   Type   Name               Secondary Cues

Warm-up
   1.     A.    Indiana Jones 1    V A
   2.     V.    Indiana Jones 2    A V A

Focus manipulation #1
   1.     V.    Crime 4            V A A
   2.     A.    Documentary 3      V A
   3.     V.    Animation 2        A V
   4.     A.    Talk 2             A V
   5.     V.    Animation 3        V A
   6.     A.    Talk 3             A V A
   7.     A.    Documentary 4      V A V
   8.     V.    Adventure 3        A V
   9.     A.    News 4             A V
  10.     V.    Adventure 4        V A
  11.     A.    Talk 4             V A
  12.     V.    Animation 4        V A

Focus manipulation #2
  13.     A.    News 1             A V
  14.     A.    Talk 1             V A
  15.     A.    Documentary 1      V V A
  16.     V.    Animation 1        V A
  17.     V.    Crime 1            A A V
  18.     V.    Adventure 1        V A
  19.     A.    News 2             V A
  20.     V.    Adventure 2        A V V
  21.     A.    News 3             V A
  22.     A.    Documentary 2      A A V
  23.     V.    Crime 2            A V
  24.     V.    Crime 3            V A



Appendix C: 



SAMPLE QUESTIONNAIRE 

SUBJECT 



SAMPLE QUESTIONS 



There are two parts to this questionnaire. First, I have a few questions about the experiment in general.
Second, there are several questions about what appeared in the television scenes. This will take about 15
minutes.

Please respond by circling or writing in your answer. Thank you.



1. When I told you to focus on the video material - the pictures - were you able to? 

YES (1) 
NO (0) 



2. How easy was it to focus on the video material? 

VERY VERY 
EASY DIFFICULT 
1 2 3 4 5 6 7 



3. What made it easy to focus on the video material? 



4. What made it difficult to focus on the video material? 



5. When I told you to focus on the audio material - the words and sounds - were you able to? 

YES (1) 
NO (0) 

6. How easy was it to focus on the audio material?

VERY VERY
EASY DIFFICULT
1 2 3 4 5 6 7



In the report on the refinery in Baghdad, the damage was a result of:

a. clandestine operations 

b. repeated bombing 

c. Kurdish sabotage 

d. retaliation by Kuwaitis 

The refinery has been restored to 

a. 25% of capacity 

b. 50% of capacity 

c. 75% of capacity 

d. 100% of capacity 

In the story men were: 

a. inspecting plans 

b. filling trucks with oil 

c. welding pipes 

d. drilling new wells 

What was the man who was interviewed wearing? 

a. blue hard hat 

b. traditional 

c. suit and tie 

d. military uniform 



In the story on 900 numbers, the man's business phone was billed 

a. $500 

b. $700 

c. $1000 

d. $1200 

In a letter from a credit collection bureau, he was told, 

a. in 24 hours they would turn off his phone 

b. in 24 hours they would prosecute him 

c. in 48 hours they would turn off his phone

d. in 48 hours they would prosecute him

The man was on the 

a. Phil Donohue Show 

b. Geraldo 

c. Good Morning America 

d. Oprah Winfrey Show 

He was or had:

a. clean shaven

b. a mustache

c. a beard

d. shoulder-length hair